Cross-Lingual Information to the Rescue in Keyword Extraction

نویسندگان

  • Chung-Chi Huang
  • Maxine Eskénazi
  • Jaime G. Carbonell
  • Lun-Wei Ku
  • Ping-Che Yang
چکیده

We introduce a method that extracts keywords in a language with the help of the other. In our approach, we bridge and fuse conventionally irrelevant word statistics in languages. The method involves estimating preferences for keywords w.r.t. domain topics and generating cross-lingual bridges for word statistics integration. At run-time, we transform parallel articles into word graphs, build cross-lingual edges, and exploit PageRank with word keyness information for keyword extraction. We present the system, BiKEA, that applies the method to keyword analysis. Experiments show that keyword extraction benefits from PageRank, globally learned keyword preferences, and cross-lingual word statistics interaction which respects language diversity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cross-Lingual Question Answering Using Common Semantic Space

With the advent of Big Data concept, a lot of attention has been paid to structuring and giving semantic to this data. Knowledge bases like DBPedia play an important role to achieve this goal. Question answering systems are common approach to address expressivity and usability of information extraction from knowledge bases. Recent researches focused only on monolingual QA systems while cross-li...

متن کامل

JAVELIN III: Cross-Lingual Question Answering from Japanese and Chinese Documents

In this paper, we describe the JAVELIN Cross Language Question Answering system, which includes modules for question analysis, keyword translation, document retrieval, information extraction and answer generation. In the NTCIR6 CLQA2 evaluation, our system achieved 19% and 13% accuracy in the English-to-Chinese and English-to-Japanese subtasks, respectively. An overall analysis and a detailed m...

متن کامل

MIKE: An Interactive Microblogging Keyword Extractor using Contextual Semantic Smoothing

Social media, such as tweets on Twitter and Short Message Service (SMS) messages on cellular networks, are short-length textual documents (short texts or microblog posts) exchanged among users on the Web and/or their mobile devices. Automatic keyword extraction from short texts can be applied in online applications such as tag recommendation and contextual advertising. In this paper we present ...

متن کامل

Exploiting Knowledge Bases for Multilingual and Cross-lingual Semantic Annotation and Search

The amount of entities in large knowledge bases (KBs) has been increasing rapidly, making it possible to propose new ways of intelligent information access. In addition, there is an impending need for systems that can enable multilingual and cross-lingual information access. In this work, we firstly demonstrate X-LiSA, an infrastructure for multilingual and cross-lingual semantic annotation, wh...

متن کامل

A Knowledge Base Approach to Cross-Lingual Keyword Query Interpretation

The amount of entities in large knowledge bases available on the Web has been increasing rapidly, making it possible to propose new ways of intelligent information access. In addition, there is an impending need for technologies that can enable cross-lingual information access. As a simple and intuitive way of specifying information needs, keyword queries enjoy widespread usage, but suffer from...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014